Abstract

For ordinal outcomes, we construct sequences of alternative hypotheses in increasing departures
from the sharp null hypothesis of zero treatment effect on each experimental unit, to help
assess the powers of randomization tests in randomized treatment-control experiments.
1. INTRODUCTION


Introduced by Fisher (1935), randomization tests are useful tools for causal inference, because they
assess the statistical significance of treatment effects without making any assumptions about the
underlying distribution of the outcome. Early theories on randomization tests were developed by
Pitman (1938) and Kempthorne (1952), who showed that many statistical procedures can be
viewed as approximations of randomization tests. To quote Bradley (1968), “[a] corresponding
parametric test is valid only to the extent that it results in the same statistical decision [as the
randomization test].” A crucial advantage of randomization tests is their ability to handle
nonstandard (e.g., ordinal) outcomes. However, there appears to be limited research on how to assess
the powers of randomization tests for ordinal outcomes.


Several researchers, such as Miller (2006), have pointed out that in many randomized experiments,
experimental units cannot be viewed as a random sample drawn from a hypothetical population.
Therefore, it is important to restrict the scope of inference to the finite population of experimental
units. The potential outcomes framework (Neyman 1923; Rubin 1974) makes randomization tests
easy to interpret and, more importantly, helps us recognize their role in making finite-population
inference. However, it does not naturally permit the assessment of the powers of randomization
tests, which requires constructing alternatives to the sharp null hypothesis of zero treatment effect
on each experimental unit. The existing literature (e.g., Lehmann 1975; Rosenbaum 2010) assesses
the powers of randomization tests by invoking infinite population models, primarily to circumvent
the difficulty associated with the construction of finite population alternatives. Such a construction
requires specifying the potential outcomes for all experimental units, and is thus considered “a
thankless task” by experts (Rosenbaum 2010). Such a “thankless task” can be made easy by
invoking the assumption of independence of potential outcomes, as in some existing literature (e.g.,
Cheng 2009; Agresti 2010). However, the impact of association between potential outcomes on the
powers of randomization tests in a finite population setting has never been investigated before.

In this paper, we demonstrate that it is indeed possible to construct alternative populations of
ordinal potential outcomes without invoking the independence assumption. We propose a procedure
to construct alternative hypotheses for ordinal outcomes, which is particularly useful in the finite
population setting, but is also applicable to infinite populations. Moreover, unlike the existing
literature (e.g., Cheng 2009; Agresti 2010), which assumes independent potential outcomes, our
construction procedure takes into account the dependence structure of the potential outcomes and
demonstrates (through simulation studies) that the association indeed does affect the powers of
randomization tests in a finite population setting.

The paper proceeds as follows. Section 2 reviews randomization tests of the sharp null hypothesis
for ordinal outcomes. Section 3 introduces two measures quantifying departures from the sharp
null hypothesis, discusses their relationships to the powers of randomization tests, and proposes a
procedure to construct alternative hypotheses in closed form. Section 4 reports the results of a
simulation study that demonstrates how to use the proposed construction procedure to assess the
powers of randomization tests. Section 5 presents some concluding remarks.
2. RANDOMIZATION TESTS FOR ORDINAL OUTCOMES


2.1. Potential Outcomes, Sharp Null Hypothesis and Randomization Test 

We consider a completely randomized experiment with $N$ units, a binary treatment and an ordinal
outcome with $J$ categories labeled as $0, \ldots, J-1$, where $0$ and $J-1$ are the “worst” and “best”
categories, respectively. Under the Stable Unit Treatment Value Assumption (Rubin 1980) that
there is only one version of the treatment and no interference among units, we define the pair
$\{Y_i(1), Y_i(0)\}$ as the potential outcomes of the $i$th unit under treatment and control. Let

$$p_{kl} = \mathrm{pr}\{Y_i(1) = k, Y_i(0) = l\} \quad (k, l = 0, 1, \ldots, J-1)$$

be the probability of potential outcomes $k$ and $l$ under treatment and control. Here, the proportion
or probability notation $\mathrm{pr}(\cdot)$ can be defined for a finite population with $N$ units, or for an infinite
population. The $J \times J$ probability matrix $P = (p_{kl})_{0 \le k, l \le J-1}$ summarizes the joint distribution of
the potential outcomes, and plays a crucial role in our later discussion. Let

$$p_{k+} = \sum_{l'=0}^{J-1} p_{kl'}, \quad p_{+l} = \sum_{k'=0}^{J-1} p_{k'l} \quad (k, l = 0, 1, \ldots, J-1).$$

The vectors $p_1 = (p_{0+}, \ldots, p_{J-1,+})^T$ and $p_0 = (p_{+0}, \ldots, p_{+,J-1})^T$ characterize the marginal
distributions of the potential outcomes under treatment and control.

Using the potential outcomes, we express the sharp null hypothesis as $Y_i(0) = Y_i(1)$ for all $i$.
Under the sharp null hypothesis, the probability matrix $P$ is diagonal with $p_{j+} = p_{jj} = p_{+j}$ for
all $j = 0, 1, \ldots, J-1$. To test the sharp null hypothesis, we use data from completely randomized
experiments with $N_1$ units assigned to treatment. For the $i$th unit, we denote its treatment indicator
as $W_i$, and its observed outcome is consequently $Y_i^{obs} = W_i Y_i(1) + (1 - W_i) Y_i(0)$. For each $j$, let $n_{0j}$
and $n_{1j}$ respectively represent the numbers of units exposed to control and treatment with observed
outcome $j$. Given the observed data, we first choose a suitable test statistic, typically a “measure
of extremeness” (Brillinger et al. 1978), and then obtain a p-value by comparing the observed value
of the test statistic to its randomization distribution.
3. CONSTRUCTION OF ALTERNATIVE HYPOTHESES


To evaluate the powers of randomization tests, we need to construct alternatives to the sharp null 
hypothesis. We can violate the sharp null hypothesis in two distinct ways: 

1. different marginal probabilities, i.e., $p_1 \ne p_0$;

2. identical marginal probabilities, but nonzero off-diagonal elements in $P$.


For example, consider the following probability matrices 


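[Display (1): three $3 \times 3$ probability matrices $P_1$, $P_2$ and $P_3$. The entries of $P_1$ and $P_2$ are
not legible in this copy; the third matrix is]

$$P_3 = \begin{pmatrix} 1/9 & 1/9 & 1/9 \\ 1/9 & 1/9 & 1/9 \\ 1/9 & 1/9 & 1/9 \end{pmatrix}, \quad (1)$$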


all of which violate the sharp null hypothesis. In particular, $P_1$ has different marginal probabilities;
$P_2$ and $P_3$ have identical marginal probabilities but nonzero off-diagonal elements.

Inspired by this example, to construct alternative hypotheses we introduce two measures
quantifying violations of the sharp null hypothesis. We use the Hellinger distance $\tau_{HD}$ (Hellinger 1909)
to quantify the difference between the marginal probabilities:

$$\tau_{HD}(p_1, p_0) = \left\{ \frac{1}{2} \sum_{j=0}^{J-1} \left( \sqrt{p_{j+}} - \sqrt{p_{+j}} \right)^2 \right\}^{1/2}.$$

Other choices include the Kullback-Leibler divergence and the total variation distance. Under the sharp
null hypothesis $\tau_{HD} = 0$, and therefore nonzero $\tau_{HD}$ implies violations of the sharp null hypothesis.
However, $\tau_{HD}$ relies solely on the marginal probabilities, ignoring the joint distribution of the
potential outcomes. For example, the probability matrices $P_2$ and $P_3$ in (1) violate the sharp null
hypothesis although $\tau_{HD} = 0$. To address this issue we use Cohen’s kappa (Cohen 1960):


$$\kappa(P) = \{\mathrm{tr}(P) - p_1^T p_0\} / (1 - p_1^T p_0), \quad (2)$$

where $\mathrm{tr}(\cdot)$ is the trace function. Cohen’s $\kappa$ relies on the probability matrix $P$, and under the sharp
null hypothesis $\kappa = 1$ because $P$ is diagonal.
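As a numerical illustration, both measures can be computed directly from a probability matrix. The following is a minimal sketch in Python with NumPy, not part of the original analysis; the function names `marginals`, `hellinger` and `cohen_kappa` are hypothetical names chosen for this example.

```python
import numpy as np

def marginals(P):
    """Return (p1, p0): row sums (treatment) and column sums (control) of P."""
    P = np.asarray(P, dtype=float)
    return P.sum(axis=1), P.sum(axis=0)

def hellinger(p1, p0):
    """Hellinger distance between two marginal probability vectors."""
    p1 = np.asarray(p1, dtype=float)
    p0 = np.asarray(p0, dtype=float)
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p1) - np.sqrt(p0)) ** 2)))

def cohen_kappa(P):
    """Cohen's kappa of a joint probability matrix P, as in equation (2)."""
    P = np.asarray(P, dtype=float)
    p1, p0 = marginals(P)
    chance = p1 @ p0                      # p_1^T p_0
    return float((np.trace(P) - chance) / (1.0 - chance))

# The uniform 3 x 3 matrix with all entries 1/9: identical marginals,
# so the Hellinger distance is zero, and kappa is zero as well.
P_uniform = np.full((3, 3), 1.0 / 9.0)
p1, p0 = marginals(P_uniform)
print(hellinger(p1, p0), cohen_kappa(P_uniform))

# A diagonal matrix (sharp null hypothesis) has kappa equal to one.
print(cohen_kappa(np.diag([0.2, 0.3, 0.5])))
```

The uniform matrix shows why the Hellinger distance alone is not enough: its marginals coincide, yet its kappa is far from one.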
We now construct sequences of alternative hypotheses by varying the Hellinger distance $\tau_{HD}$
and Cohen’s $\kappa$. To be more specific, we follow a two-step procedure:

1. construct a sequence of marginal probabilities, in an increasing order of $\tau_{HD}$;

2. for fixed marginal probabilities, construct a sequence of probability matrices in an increasing
order of $\kappa$, which involves the following sub-steps:

(a) minimize and maximize $\kappa$ subject to the following constraints:

$$\sum_{k'=0}^{J-1} p_{k'l} = p_{+l}, \quad \sum_{l'=0}^{J-1} p_{kl'} = p_{k+}, \quad p_{kl} \ge 0 \quad (k, l = 0, 1, \ldots, J-1);$$

(b) use a convex combination of the minimizer and maximizer to construct probability
matrices with intermediate values of $\kappa$.

Step 1 helps assess the impact of $\tau_{HD}$ on the powers of randomization tests, and Step 2 further
helps assess the impact of $\kappa$. For fixed marginal probabilities, Sub-step (a) studies the two extreme
cases of “most” and “least” violations of the sharp null hypothesis, and Sub-step (b) addresses the
“in between” cases. Therefore, this procedure provides a relatively complete picture of violations
of the sharp null hypothesis.

For given marginal probabilities $p_1$ and $p_0$, the minimization problem in the above procedure is
somewhat intuitive. Consider the probability matrix $P_I$ with independent potential outcomes, i.e.,
$p_{kl} = p_{k+} p_{+l}$ for all $k, l$. Because $\mathrm{tr}(P_I) = p_1^T p_0$, we have $\kappa(P_I) = 0$. If we are not interested in
distributions with negatively associated potential outcomes, $P_I$ therefore minimizes $\kappa$, with minimum
value zero. The maximization problem is, however, non-trivial, and almost
intractable unless some restrictions are imposed on $P$. For the purpose of simplification, we make
the following assumption:

Assumption 1. (Stochastic Dominance) For all $j = 1, \ldots, J-1$, $\sum_{k \ge j} p_{k+} \ge \sum_{l \ge j} p_{+l}$.

As an illustration, consider the three probability matrices in (1). Among these matrices, $P_1$
does not satisfy Stochastic Dominance. Besides the advantage of reducing the number of possible
alternative hypotheses, in applied research the stochastic dominance pattern occurs frequently (e.g.,
Bradley et al.; Bajorski and Petkau 1999), and is termed “positive distributional causal effect”
by Ju and Geng (2010). Because of the aforementioned technical convenience and the practical
importance, we first focus on marginal probabilities $p_1$ and $p_0$ that satisfy Stochastic Dominance,
and then discuss general marginal probabilities.
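Checking Stochastic Dominance for a given pair of marginals amounts to comparing upper-tail sums. The snippet below is a minimal sketch of such a check, assuming NumPy; the function name `satisfies_stochastic_dominance` and the tolerance `tol` are hypothetical choices made for this illustration.

```python
import numpy as np

def satisfies_stochastic_dominance(p1, p0, tol=1e-12):
    """Check that, for every j = 1, ..., J-1, the upper-tail sum of the
    treatment marginals p1 is at least that of the control marginals p0."""
    p1 = np.asarray(p1, dtype=float)
    p0 = np.asarray(p0, dtype=float)
    # tail[j] = sum of the entries with index >= j
    tail1 = np.cumsum(p1[::-1])[::-1]
    tail0 = np.cumsum(p0[::-1])[::-1]
    # j = 0 is skipped: both tails equal one there
    return bool(np.all(tail1[1:] >= tail0[1:] - tol))

# Treatment shifts mass towards the "best" category: dominance holds.
print(satisfies_stochastic_dominance([0.25, 0.25, 0.5], [0.4, 0.4, 0.2]))
# Swapping the roles of the two marginals breaks it.
print(satisfies_stochastic_dominance([0.4, 0.4, 0.2], [0.25, 0.25, 0.5]))
```

Identical marginals satisfy the condition with equality, so matrices that differ from the sharp null only through their off-diagonal mass always pass this check.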


We further simplify the maximization problem by restricting the maximizer to be lower
triangular, utilizing a well-known result that for any marginal probabilities satisfying Stochastic
Dominance, there exists a probability matrix with these marginals that is lower triangular. This
existence result is a special case of Strassen’s theorem (Strassen 1965; Lindvall 1992), which was
utilized by Rosenbaum (2001). We are now in a position to state and prove the following theorem
that provides the maximum value of $\kappa$, and the maximizer itself:

Theorem 1. For any $J \ge 2$, given marginal probabilities $p_1$ and $p_0$ satisfying Stochastic
Dominance, there exists a lower triangular probability matrix $P_+$ achieving the upper bound of $\kappa$:

$$\kappa(P) \le \kappa(P_+) = \left\{ \sum_{k=0}^{J-1} \min(p_{k+}, p_{+k}) - p_1^T p_0 \right\} / (1 - p_1^T p_0). \quad (3)$$

Proof of Theorem 1. The proof consists of two parts. We first show that (3) is an upper bound of
$\kappa$, and then construct a probability matrix attaining it.

For all $k = 0, \ldots, J-1$, the diagonal element $p_{kk}$ of matrix $P$ cannot be greater than either
$p_{k+}$ or $p_{+k}$, i.e., $p_{kk} \le \min(p_{k+}, p_{+k})$, which implies

$$\mathrm{tr}(P) \le \sum_{k=0}^{J-1} \min(p_{k+}, p_{+k}). \quad (4)$$

Substituting (4) in (2) yields the upper bound of $\kappa$ in (3).

We then sequentially construct a $J \times J$ lower triangular matrix with fixed row sums $p_1$ and
column sums $p_0$. We start with the last column and proceed backwards. At any point in the
construction, we denote the elements in the matrix already filled by $p_{kl}$ and those not yet filled by $\tilde{p}_{kl}$.
First, for the last column with index $J-1$, only the last entry needs to be filled, and we set it equal
to the corresponding column sum, i.e., $p_{J-1,J-1} = p_{+,J-1}$. Next, for all $r = 1, \ldots, J-1$, given that all
elements in the last $r$ columns are already filled, we consider the problem of filling in the elements
of the column with index $l = J-r-1$, as shown in Table 1. At this point, the already filled elements
in the matrix are $p_{kl}$, where $k = 0, \ldots, J-1$ and $l = J-r, \ldots, J-1$.

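Table 1: Filling in the column with index $l = J-r-1$ when the last $r$ columns are already filled

$$
\begin{array}{c|cccccc|c}
 & \multicolumn{6}{c|}{\text{Column index } (l)} & \\
\text{Row index } (k) & 0 & \cdots & J-r-1 & J-r & \cdots & J-1 & \text{Row sum } p_{k+} \\ \hline
0 & \tilde{p}_{00} & \cdots & 0 & 0 & \cdots & 0 & p_{0+} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots & \vdots \\
J-r-1 & \tilde{p}_{J-r-1,0} & \cdots & \min(p_{J-r-1,+}, p_{+,J-r-1}) & 0 & \cdots & 0 & p_{J-r-1,+} \\
J-r & \tilde{p}_{J-r,0} & \cdots & \tilde{p}_{J-r,J-r-1} = {?} & p_{J-r,J-r} & \cdots & 0 & p_{J-r,+} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots & \vdots \\
J-1 & \tilde{p}_{J-1,0} & \cdots & \tilde{p}_{J-1,J-r-1} = {?} & p_{J-1,J-r} & \cdots & p_{J-1,J-1} = p_{+,J-1} & p_{J-1,+} \\ \hline
\text{Column sum} & p_{+0} & \cdots & p_{+,J-r-1} & p_{+,J-r} & \cdots & p_{+,J-1} & 1
\end{array}
$$

To fill the column with index $l = J-r-1$, note that all entries for $k < J-r-1$ will be equal
to zero. We set the diagonal element with row index $k = l = J-r-1$ to be the minimum of
the corresponding row and column sums, i.e., $p_{J-r-1,J-r-1} = \min(p_{J-r-1,+}, p_{+,J-r-1})$. Now the
difference $p_{+,J-r-1} - \min(p_{J-r-1,+}, p_{+,J-r-1})$ needs to be distributed over the remaining, not yet
filled entries $\tilde{p}_{k,J-r-1}$ $(k = J-r, \ldots, J-1)$ of the column. Note that this difference is zero if
$\min(p_{J-r-1,+}, p_{+,J-r-1}) = p_{+,J-r-1}$. Therefore, for all $k = J-r, \ldots, J-1$, we make the entry
$p_{k,J-r-1}$ proportional to the “remaining balance” $p_{+,J-r-1} - \min(p_{J-r-1,+}, p_{+,J-r-1})$, where we
choose the proportionality constant as

$$\frac{\sum_{l=0}^{J-r-1} \tilde{p}_{kl}}{\sum_{k' \ge J-r} \sum_{l=0}^{J-r-1} \tilde{p}_{k'l}}, \quad (5)$$

that is, the ratio of the sum of empty entries in the row with label $k$ and the sum of empty entries
in all rows below the one labeled $J-r-1$. Both the numerator and denominator of (5) can be
expressed in terms of the given marginals and the already filled-in entries in the last $r$ columns:

$$\sum_{l=0}^{J-r-1} \tilde{p}_{kl} = p_{k+} - \sum_{l \ge J-r} p_{kl}, \quad \sum_{k' \ge J-r} \sum_{l=0}^{J-r-1} \tilde{p}_{k'l} = \sum_{k' \ge J-r} \left( p_{k'+} - \sum_{l \ge J-r} p_{k'l} \right), \quad (6)$$

and hence can be computed uniquely. The construction method, (5) or (6), eventually leads to the
following iterative imputation equation for all $k = J-r, \ldots, J-1$:

$$p_{k,J-r-1} = \frac{p_{k+} - \sum_{l \ge J-r} p_{kl}}{\sum_{k' \ge J-r} \left( p_{k'+} - \sum_{l' \ge J-r} p_{k'l'} \right)} \left\{ p_{+,J-r-1} - \min(p_{J-r-1,+}, p_{+,J-r-1}) \right\}. \quad (7)$$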
We need to show that the constructed matrix

$$P_+ = \begin{pmatrix} p_{00} & 0 & \cdots & 0 \\ p_{10} & p_{11} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ p_{J-1,0} & p_{J-1,1} & \cdots & p_{J-1,J-1} \end{pmatrix}$$

indeed satisfies (i) $p_{kl} \ge 0$ for all $k, l = 0, \ldots, J-1$, (ii) the equality in condition (4), for which
a sufficient condition is $p_{kk} = \min(p_{k+}, p_{+k})$ for all $k = 0, \ldots, J-1$, (iii) the vector of column
sums is $p_0$, i.e., $\sum_{k=0}^{J-1} p_{kl} = p_{+l}$ for all $l = 0, \ldots, J-1$, and (iv) the vector of row sums is $p_1$, i.e.,
$\sum_{l=0}^{J-1} p_{kl} = p_{k+}$ for all $k = 0, \ldots, J-1$.

Among (i)-(iv) described above, (i)-(iii) follow directly from the construction of $P_+$ described
above. We need only to prove that $\sum_{l=0}^{J-1} p_{kl} = p_{k+}$ for all $k = 0, \ldots, J-1$. By Stochastic
Dominance, we have $p_{0+} \le p_{+0}$, implying that $p_{00} = p_{0+}$. Therefore the sum of the first row of
$P_+$ is $p_{0+}$. Now for all $k = 1, \ldots, J-1$, by substituting $r = J-1$ in (7), or by filling up the first
column given the last $J-1$, we have

$$
\begin{aligned}
p_{k0} &= \frac{p_{k+} - \sum_{l=1}^{J-1} p_{kl}}{\sum_{k'=1}^{J-1} \left( p_{k'+} - \sum_{l=1}^{J-1} p_{k'l} \right)} (p_{+0} - p_{0+}) \\
&= \frac{p_{k+} - \sum_{l=1}^{J-1} p_{kl}}{(1 - p_{0+}) - (1 - p_{+0})} (p_{+0} - p_{0+}) \\
&= \frac{p_{k+} - \sum_{l=1}^{J-1} p_{kl}}{p_{+0} - p_{0+}} (p_{+0} - p_{0+}) = p_{k+} - \sum_{l=1}^{J-1} p_{kl},
\end{aligned}
$$

which implies that $\sum_{l=0}^{J-1} p_{kl} = p_{k+}$. The proof is complete. □
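The column-by-column construction used in the proof translates into a short loop. The following is a minimal sketch in Python with NumPy, under the assumption that the supplied marginals satisfy Stochastic Dominance; the function name `max_kappa_matrix` and the tolerance `tol` are hypothetical and only serve this illustration.

```python
import numpy as np

def max_kappa_matrix(p1, p0, tol=1e-12):
    """Fill a lower triangular matrix with row sums p1 and column sums p0,
    working from the last column to the first, as in the proof above.
    Assumes p1 and p0 satisfy Stochastic Dominance."""
    p1 = np.asarray(p1, dtype=float)
    p0 = np.asarray(p0, dtype=float)
    J = len(p1)
    P = np.zeros((J, J))
    for l in range(J - 1, -1, -1):
        # Diagonal entry: minimum of the corresponding row and column sums.
        P[l, l] = min(p1[l], p0[l])
        # "Remaining balance" of column l to spread over the rows below.
        balance = p0[l] - P[l, l]
        if balance > tol:
            # Mass not yet allocated in each row below the diagonal:
            # row sum minus the entries already filled in columns l+1, ..., J-1.
            remaining = p1[l + 1:] - P[l + 1:, l + 1:].sum(axis=1)
            P[l + 1:, l] = balance * remaining / remaining.sum()
    return P

p1 = [0.1, 0.2, 0.3, 0.4]   # treatment marginals (illustrative values)
p0 = [0.4, 0.3, 0.2, 0.1]   # control marginals (illustrative values)
P_plus = max_kappa_matrix(p1, p0)
print(P_plus)
print(P_plus.sum(axis=1), P_plus.sum(axis=0))
```

The illustrative marginals with $J = 4$ are not taken from the paper. The printed row and column sums should reproduce `p1` and `p0`, and the trace should equal the sum of the elementwise minima of the two marginal vectors, which is the quantity appearing in the bound (3).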

In the above proof of Theorem 1, we suggest a way of constructing the maximizer $P_+$. Next
we discuss the uniqueness of $P_+$. By restricting $P_+$ to be lower triangular and its $(j+1)$th
diagonal element $p_{jj}$ to be $\min(p_{j+}, p_{+j})$, what remain to be determined are the $(J-1)J/2$
off-diagonal elements. Note that there are $(2J-3)$ constraints associated with them. The equality
$(J-1)J/2 = 2J-3$ holds if and only if $J = 2$ or $3$.

The case with $J = 2$ corresponds to binary outcomes, which occur frequently in both methodology
and applied research. For a recent discussion of finite population inference for binary data, see
Ding and Dasgupta (2015). The following corollary provides the maximizer under $J = 2$. Although
it is a special case of Theorem 1, we provide a direct proof to rigorously show the uniqueness of
the maximizer.


Corollary 1. For $J = 2$, given marginal probabilities $p_1$ and $p_0$ that satisfy Stochastic Dominance,
the following matrix is the unique maximizer of $\kappa$:

$$P_+ = \begin{pmatrix} p_{0+} & 0 \\ p_{1+} - p_{+1} & p_{+1} \end{pmatrix}. \quad (8)$$


Proof of Corollary 1. Because $p_1$ and $p_0$ satisfy Stochastic Dominance, we have $p_{0+} \le p_{+0}$ and
$p_{1+} \ge p_{+1}$, implying that the diagonal elements of the maximizer are $p_{00} = p_{0+}$ and $p_{11} = p_{+1}$.
Because the row sums of the maximizer are $p_1$, we uniquely determine the entries of the maximizer,
as shown in (8). The maximizer has nonnegative entries because $p_{1+} \ge p_{+1}$, and its column sums
are $p_0$ because $p_{0+} + p_{1+} - p_{+1} = p_{+0}$. The proof is complete. □

The case with $J = 3$ corresponds to three-level outcomes, which are also important in practice.
For example, in a clinical trial we can describe the status of a patient as “deterioration,” “no
change” or “improvement” (Bajorski and Petkau 1999). The following corollary gives the maximizer
for $J = 3$. Again, we provide a direct proof.


Corollary 2. For $J = 3$, given marginal probabilities $p_1$ and $p_0$ that satisfy Stochastic Dominance,
the following matrix is the unique maximizer of $\kappa$:

$$P_+ = \begin{pmatrix} p_{0+} & 0 & 0 \\ p_{1+} - \min(p_{+1}, p_{1+}) & \min(p_{+1}, p_{1+}) & 0 \\ p_{2+} - p_{+2} - \{p_{+1} - \min(p_{+1}, p_{1+})\} & p_{+1} - \min(p_{+1}, p_{1+}) & p_{+2} \end{pmatrix}. \quad (9)$$


Proof of Corollary 2. Because $p_1$ and $p_0$ satisfy Stochastic Dominance, we have $p_{0+} \le p_{+0}$ and
$p_{2+} \ge p_{+2}$, which implies that the diagonal elements of the maximizer are $p_{00} = p_{0+}$, $p_{11} =
\min(p_{+1}, p_{1+})$, and $p_{22} = p_{+2}$. First, because the first row sum and third column sum are
respectively $p_{0+}$ and $p_{+2}$, the maximizer is of the following form:

$$P_+ = \begin{pmatrix} p_{0+} & 0 & 0 \\ \text{?} & \min(p_{+1}, p_{1+}) & 0 \\ \text{?} & \text{?} & p_{+2} \end{pmatrix},$$

where “?” denotes an entry yet to be determined. Second, because the second row sum and column
sum are respectively $p_{1+}$ and $p_{+1}$, the maximizer is of the following form:

$$P_+ = \begin{pmatrix} p_{0+} & 0 & 0 \\ p_{1+} - \min(p_{+1}, p_{1+}) & \min(p_{+1}, p_{1+}) & 0 \\ \text{?} & p_{+1} - \min(p_{+1}, p_{1+}) & p_{+2} \end{pmatrix}.$$

Third, because the third row sum is $p_{2+}$, we uniquely determine the maximizer, as in (9). Fourth,
$P_+$ has nonnegative entries, because

$$p_{2+} - p_{+2} - \{p_{+1} - \min(p_{+1}, p_{1+})\} = \min(p_{2+} - p_{+2}, p_{+0} - p_{0+}) \ge 0.$$

Finally, $P_+$ has row sums $p_1$ and column sums $p_0$, because

$$p_{0+} + p_{1+} - \min(p_{+1}, p_{1+}) + p_{2+} - p_{+2} - \{p_{+1} - \min(p_{+1}, p_{1+})\} = p_{+0}.$$

The proof is complete. □
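For three-level outcomes the closed form can be evaluated directly. The sketch below, in Python with NumPy, writes out the matrix of Corollary 2 entry by entry and evaluates it at the marginals $p_1 = (1/4, 1/4, 1/2)^T$ and $p_0 = (2/5, 2/5, 1/5)^T$; the function name `max_kappa_matrix_j3` and the short variable names are hypothetical.

```python
import numpy as np

def max_kappa_matrix_j3(p1, p0):
    """Closed-form maximizer for J = 3, assuming Stochastic Dominance.
    a0, a1, a2 are the treatment marginals; b0, b1, b2 the control marginals."""
    (a0, a1, a2), (b0, b1, b2) = p1, p0
    m = min(b1, a1)                      # middle diagonal entry
    return np.array([
        [a0,                 0.0,    0.0],
        [a1 - m,             m,      0.0],
        [a2 - b2 - (b1 - m), b1 - m, b2],
    ])

p1 = (0.25, 0.25, 0.5)
p0 = (0.4, 0.4, 0.2)
P_plus = max_kappa_matrix_j3(p1, p0)
print(P_plus)

# Cohen's kappa of the maximizer, from the definition of kappa.
chance = float(np.dot(p1, p0))
kappa_max = (np.trace(P_plus) - chance) / (1.0 - chance)
print(kappa_max)
```

For these marginals the maximizer has diagonal (0.25, 0.25, 0.2) and puts all remaining mass in the last row, giving a maximum kappa of 0.4/0.7, roughly 0.57. The maximum is below one because the marginals differ.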

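We end this section by constructing probability matrices with intermediate values of $\kappa$. Given
the minimizer $P_I$ and maximizer $P_+$, let $P_\lambda = \lambda P_I + (1 - \lambda) P_+$. We view $\lambda \in [0, 1]$ as a
sensitivity parameter, because we cannot estimate it from the observed data. The resulting probability
matrices have the same marginal probabilities as $P_I$ and $P_+$, and consequently the same Hellinger
distance. However, they have different $\kappa$ depending on $\lambda$: because $\kappa$ is linear in $\mathrm{tr}(P)$ for fixed
marginals and $\kappa(P_I) = 0$, we have $\kappa(P_\lambda) = (1 - \lambda) \kappa(P_+)$. For
the infinite population framework, our constructed sequence of alternative hypotheses is thus
$\{P_\lambda\}_{\lambda \in [0, 1]}$. For the finite population framework, because every entry of a well-defined probability
matrix $P$ is a multiple of $1/N$, we propose a calibration step by letting

$$p^*_{kl}(\lambda) = \begin{cases} \lfloor N p_{kl}(\lambda) \rfloor / N, & \text{if } k \ne l, \\ p_{+l} - \sum_{k' \ne l} \lfloor N p_{k'l}(\lambda) \rfloor / N, & \text{if } k = l, \end{cases}$$

where $\lfloor \cdot \rfloor$ is the floor function and $p_{kl}(\lambda)$ is the $(k, l)$ entry of $P_\lambda$. By definition, the column
sums of the calibrated matrix $P^*_\lambda = (p^*_{kl}(\lambda))$ are $p_0$. Let $\Lambda(p_1, p_0)$ denote
the set containing all $\lambda$’s such that the row sums of $P^*_\lambda$ are $p_1$; our constructed sequence of
alternative hypotheses is therefore $\{P^*_\lambda\}_{\lambda \in \Lambda(p_1, p_0)}$. In practice, we can use a grid search to obtain
an approximation of $\Lambda(p_1, p_0)$.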
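The convex combination, the calibration step and the grid search can be sketched as follows in Python with NumPy. This is a hedged illustration rather than the code used for the paper: the function names (`mixture`, `calibrate`, `admissible_lambdas`) are hypothetical, and the small constant added before taking the floor is our own safeguard against floating-point error, not part of the procedure as stated.

```python
import numpy as np

def mixture(P_ind, P_max, lam):
    """Convex combination lam * P_I + (1 - lam) * P_+."""
    return lam * np.asarray(P_ind, dtype=float) + (1.0 - lam) * np.asarray(P_max, dtype=float)

def calibrate(P, N, eps=1e-9):
    """Round off-diagonal entries down to multiples of 1/N, then set each
    diagonal entry so that the column sums of P are preserved."""
    P = np.asarray(P, dtype=float)
    col_sums = P.sum(axis=0)
    # eps guards against values such as 35.99999999 that should be 36
    # (an assumption added for numerical robustness).
    counts = np.floor(N * P + eps)
    np.fill_diagonal(counts, 0.0)
    Q = counts / N
    np.fill_diagonal(Q, col_sums - Q.sum(axis=0))
    return Q

def admissible_lambdas(p1, p0, P_max, N, grid):
    """Grid search: keep the values of lambda whose calibrated matrix
    still has row sums p1."""
    P_ind = np.outer(p1, p0)             # independent potential outcomes
    keep = []
    for lam in grid:
        Q = calibrate(mixture(P_ind, P_max, lam), N)
        if np.allclose(Q.sum(axis=1), p1, atol=1e-9):
            keep.append(float(lam))
    return keep

p1 = np.array([0.25, 0.25, 0.5])
p0 = np.array([0.4, 0.4, 0.2])
P_max = np.array([[0.25, 0.0, 0.0],
                  [0.0, 0.25, 0.0],
                  [0.15, 0.15, 0.2]])    # maximizer for these marginals
print(admissible_lambdas(p1, p0, P_max, N=120, grid=[0, 0.25, 0.5, 0.75, 1]))
```

A finer grid (for instance `np.linspace(0, 1, 101)`) gives a closer approximation of the admissible set, at the cost of more calibrated matrices to check.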


4. A SIMULATION STUDY 


We demonstrate how the above construction facilitates assessment of the powers of randomization tests.
We use the squared Mann-Whitney $U$-statistic (Agresti 2002):

$$U^2 = \left[ \sum_{k=0}^{J-1} \sum_{l=0}^{J-1} n_{1k} n_{0l} \{ I(k > l) - I(k < l) \} \right]^2.$$


Another commonly used test statistic for categorical data is the $\chi^2$-statistic. However, we do not
use it in this paper, because it does not utilize the order information and therefore is less powerful.
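The squared Mann-Whitney statistic depends on the data only through the category counts in the two arms. A minimal sketch of its computation in Python with NumPy follows; the function name `u_squared` and the argument names are hypothetical.

```python
import numpy as np

def u_squared(n1, n0):
    """Squared Mann-Whitney statistic from category counts.
    n1[k]: number of treated units with observed outcome k;
    n0[l]: number of control units with observed outcome l."""
    n1 = np.asarray(n1, dtype=float)
    n0 = np.asarray(n0, dtype=float)
    idx = np.arange(len(n1))
    # sign[k, l] = I(k > l) - I(k < l)
    sign = np.sign(idx[:, None] - idx[None, :])
    return float(np.sum(np.outer(n1, n0) * sign)) ** 2

# Two categories: 1 and 2 treated units, 2 and 1 control units.
# Treated-above-control pairs: 2 * 2 = 4; treated-below-control pairs: 1 * 1 = 1.
print(u_squared([1, 2], [2, 1]))   # (4 - 1)^2 = 9.0
```

When the two arms have identical counts the concordant and discordant pairs cancel, so the statistic is zero.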

Although closed-form expressions of the powers of the randomization test using the $U^2$ statistic are
difficult to obtain, numerical calculations by Monte Carlo are straightforward, once we determine
the alternative hypothesis $P$. We will follow three steps:

1. under P generate 2 x 10^ independent treatment assignments and obtain the corresponding 
observed data sets; 

2. for each observed data set calculate the p-value of the randomization test using the observed
value of $U^2$ and its simulated null distribution;

3. approximate the power of the $U^2$ statistic as the proportion of the p-values that are smaller
than the significance level $\alpha = 0.05$.
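The three steps above can be sketched as follows in Python with NumPy. This is an illustrative implementation under stated assumptions, not the code behind the reported results: all function names are hypothetical, `N * P` is assumed to have integer entries, and the default numbers of assignments and null draws are deliberately small so that the sketch runs quickly.

```python
import numpy as np

def population_from_matrix(P, N):
    """Finite population of N units whose joint distribution of
    (Y(1), Y(0)) is P; assumes N * P has integer entries."""
    counts = np.rint(N * np.asarray(P, dtype=float)).astype(int)
    J = counts.shape[0]
    k_idx, l_idx = np.divmod(np.arange(J * J), J)   # cell index -> (k, l)
    return np.repeat(k_idx, counts.ravel()), np.repeat(l_idx, counts.ravel())

def u_squared_from_data(y_obs, w, J):
    """Squared Mann-Whitney statistic from observed outcomes and assignments."""
    n1 = np.bincount(y_obs[w == 1], minlength=J)
    n0 = np.bincount(y_obs[w == 0], minlength=J)
    idx = np.arange(J)
    sign = np.sign(idx[:, None] - idx[None, :])
    return float(np.sum(np.outer(n1, n0) * sign)) ** 2

def random_assignment(N, N1, rng):
    """Completely randomized assignment with N1 treated units."""
    w = np.zeros(N, dtype=int)
    w[rng.choice(N, size=N1, replace=False)] = 1
    return w

def randomization_pvalue(y_obs, w, J, n_null, rng):
    """Step 2: p-value from a simulated null distribution, re-randomizing
    the assignment while holding the observed outcomes fixed."""
    observed = u_squared_from_data(y_obs, w, J)
    N, N1 = len(w), int(w.sum())
    null = [u_squared_from_data(y_obs, random_assignment(N, N1, rng), J)
            for _ in range(n_null)]
    return float(np.mean(np.array(null) >= observed))

def estimate_power(P, N, N1, n_assign=200, n_null=200, alpha=0.05, seed=0):
    """Steps 1 and 3: draw assignments under P and report the share of
    p-values below alpha."""
    rng = np.random.default_rng(seed)
    y1, y0 = population_from_matrix(P, N)
    J = np.asarray(P).shape[0]
    rejections = 0
    for _ in range(n_assign):
        w = random_assignment(len(y1), N1, rng)
        y_obs = np.where(w == 1, y1, y0)            # observed outcomes
        rejections += randomization_pvalue(y_obs, w, J, n_null, rng) < alpha
    return rejections / n_assign
```

A call such as `estimate_power(P, N=120, N1=60)` returns a Monte Carlo estimate whose precision is limited by `n_assign` and `n_null`; both would need to be increased substantially for results comparable to a full simulation study.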

In this simulation study, we construct alternative hypotheses using the following four sets of 
marginal probabilities, two with J = 2 and two with J = 3: 
1. $p_1 = (3/10, 7/10)^T$, $p_0 = (3/5, 2/5)^T$, $\tau_{HD} = 0.216$;

2. $p_1 = (1/2, 1/2)^T$, $p_0 = (4/5, 1/5)^T$, $\tau_{HD} = 0.227$;

3. $p_1 = (1/4, 1/4, 1/2)^T$, $p_0 = (2/5, 2/5, 1/5)^T$, $\tau_{HD} = 0.227$;

4. $p_1 = (9/40, 9/40, 11/20)^T$, $p_0 = (2/5, 2/5, 1/5)^T$, $\tau_{HD} = 0.261$.


For each case, we let the sample sizes $N = 120, 160, 240$, and the sensitivity parameters $\lambda =
0, 1/4, 1/2, 3/4, 1$. We then construct the probability matrices, which share the same marginals.
For each probability matrix $P_\lambda$, we use the aforementioned Monte Carlo procedure to calculate the
powers. On one hand, different cases allow us to study the impact of $\tau_{HD}$ on the powers. On the
other hand, within each case we can study the impact of $\kappa$ on the powers. The simulation results
are summarized in Figure 1, from which we draw the following conclusions. For all fixed sample
sizes, the power functions of Case 2 dominate those of Case 1, and the power functions of Case
4 dominate those of Case 3. Therefore, for fixed $J$ the power increases as the Hellinger distance
increases. Furthermore, for fixed marginals and sample size, the power increases as $\kappa$ decreases, or
equivalently as $\lambda$ increases. However, this dependence becomes weaker as the sample size increases,
because the power converges to one.

We can use the demonstrated methodology to compare the power functions of different test 
statistics, and also to determine sample sizes that guarantee a pre-specified power. For instance, in 
Case 3, we cannot guarantee a power of 0.95 with a sample of size 120, but we can with a sample 
of size 160. 

In summary, for a finite population, the power of the randomization test using $U^2$ depends
on the marginal difference of the potential outcomes as well as the association between them. In
particular, the power increases as the marginal difference increases, and with the marginals fixed,
the power increases as the association decreases. Furthermore, the power converges to one as the
sample size increases, for any case with nonzero marginal difference. The above conclusions appear
to confirm our intuition, because it should be easier to reject the sharp null hypothesis given larger
differences between the marginals, and with the marginals fixed, it should be easier to reject the
sharp null hypothesis given smaller associations between the potential outcomes. Our findings
conform to the results of Plackett (1977) and Chernoff (2004) about the classical $2 \times 2$ tables: the
marginals of the contingency tables contain a limited amount of information about the association
with finite samples, which becomes negligible asymptotically.


5. DISCUSSION 

In this paper, we construct sequences of finite populations of ordinal outcomes in increasing
departures from the sharp null hypothesis of no treatment effect. Our construction procedure is useful
for evaluating the powers of randomization tests. In particular, our construction procedure takes
into account the dependence structure of the potential outcomes, whereas the existing literature
often assumes independent potential outcomes. Through a simulation study, we demonstrate that
the association between potential outcomes indeed affects the powers of randomization tests. We
argue that taking into account the association between potential outcomes is crucial for
conducting randomization tests in practice, for example when determining sample sizes that guarantee a
pre-specified power.

There are multiple future directions based on our work. First, although we adopt a numerical
approach, it is possible to derive the asymptotic distribution of the $U^2$ statistic under the sharp null
hypothesis. Second, we can derive the maximizer of $\kappa$ for general marginal probabilities that do not
satisfy Stochastic Dominance. Third, we can incorporate covariate information to further improve
the powers of randomization tests. Fourth, while the Fisherian randomization-based inference is
a useful first step, Neymanian and Bayesian counterparts of causal inference for ordinal outcomes
are still needed. For some recent developments, see Lu et al. (2015).